A Visual Tracking Framework for Intent Recognition in Videos

Authors

  • Alireza Tavakkoli
  • Richard Kelley
  • Christopher King
  • Mircea Nicolescu
  • Monica N. Nicolescu
  • George Bebis
Abstract

To function in the real world, a robot must be able to understand human intentions. This capability depends on accurate and reliable detection and tracking of trajectories of agents in the scene. We propose a visual tracking framework to generate and maintain trajectory information for all agents of interest in a complex scene. We employ this framework in an intent recognition system that uses spatio-temporal contextual information to recognize the intentions of agents acting in different scenes, comparing our system with the state of the art.


Similar resources

Speaker Retrieval for TV Show Videos by Associating Audio Speaker Recognition Result to Visual Faces

Person retrieval based solely on visual face recognition is hard because of the well-known problems of illumination, pose, size and expression variation, which can exceed the variation due to identity. Fortunately, videos are often accompanied by other modalities, such as audio and text. In this paper, we propose a framework to associate who and when information provided by speaker recognition result to t...

Towards Real-Time Detection and Tracking of Basketball Players using Deep Neural Networks

Online multi-player detection and tracking in broadcast basketball videos are significantly challenging tasks. In these environments, the target distributions are highly non-linear, and the varying number of objects creates complex interactions with overlap and ambiguities. In this paper, we present a real-time multi-person detection and tracking framework that is able to perform detection and tra...

Nostril Detection for Robust Mouth Tracking

Within an Audio-Visual Speech Recognition (AVSR) framework, an important process is video feature extraction. Several methods are available, but all of them require mouth region extraction. To achieve this, a semi-automatic system based on nostril detection is presented. The system is designed to work on ordinary frontal videos and to be able to recover from brief nostril occlusion. Using the nostril...

An Agile Framework for Real-time Visual Tracking in Videos

We present an agile framework for automated tracking of moving objects in full motion video (FMV). The framework is robust, being able to track multiple foreground objects of different types (e.g., person, vehicle) having disparate motion characteristics (like speed, uniformity) simultaneously in real time under changing lighting conditions, background, and disparate dynamics of the camera. It ...

Hierarchical Attentive Recurrent Tracking

Class-agnostic object tracking is particularly difficult in cluttered environments as target specific discriminative models cannot be learned a priori. Inspired by how the human visual cortex employs spatial attention and separate “where” and “what” processing pathways to actively suppress irrelevant visual features, this work develops a hierarchical attentive recurrent model for single object ...

Publication date: 2008